AITopics | independent replication

Nested Atoms Model with Application to Clustering Big Population-Scale Single-Cell Data

Chakrabarti, Arhit, Ni, Yang, Jiang, Yuchao, Mallick, Bani K.

arXiv.org Machine Learning, Apr-14-2026

We consider the problem of clustering nested or hierarchical data, where observations are grouped and there are both group-level and observation-level variables. In our motivating OneK1K dataset, observations consist of single-cell RNA-sequencing (scRNA-seq) data from 982 individuals (groups), totaling 1.27 million cells (observations), along with individual-specific genotype data. This type of data would enable the identification of cell types and the investigation of how genetic variations among individuals influence differences in cell-type profiles. Our goal, therefore, is to jointly cluster cells and individuals to capture the heterogeneity across both levels using cell-specific gene expressions as well as individual-specific genotypes. However, existing grouped clustering methods do not incorporate group-level variables, thereby limiting their ability to capture the heterogeneity of genotypes in our motivating application. To address this, we propose the Nested Atoms Model (NAM), a new Bayesian nonparametric approach that enables the desired two-layered clustering, accounting for both group-level and observation-level variables. To scale NAM for high-dimensional data, we develop a fast variational Bayesian inference algorithm. Simulations show that NAM outperforms existing methods that ignore group-level variables. Applied to the OneK1K dataset, NAM identifies clusters of genetically similar individuals with homogeneous cell-type profiles. The resulting cell clusters align with known immune cell types based on differential gene expression, underscoring the ability of NAM to capture nested heterogeneity and provide biologically meaningful insights.

artificial intelligence, group-level variable, machine learning, (15 more...)

arXiv:2604.11731

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)

Statistical Model Checking of NetLogo Models

Pangallo, Marco, Giachini, Daniele, Vandin, Andrea

arXiv.org Artificial Intelligence, Sep-16-2025

Agent-based models (ABMs) are gaining increasing traction in several domains, due to their ability to represent complex systems that are not easily expressible with classical mathematical models. This expressivity and richness come at a cost: ABMs can typically be analyzed only through simulation, making their analysis challenging. Specifically, when studying the output of ABMs, the analyst is often confronted with practical questions such as: (i) how many independent replications should be run? (ii) how many initial time steps should be discarded as a warm-up? (iii) after the warm-up, how long should the model run? (iv) what are the right parameter values? Analysts usually resort to rules of thumb and experimentation, which lack statistical rigor. This is mainly because addressing these points takes time, and analysts prefer to spend their limited time improving the model. In this paper, we propose a methodology, drawing on the field of Statistical Model Checking, to automate the process and provide guarantees of statistical rigor for ABMs written in NetLogo, one of the most popular ABM platforms. We discuss MultiVeStA, a tool that dramatically reduces the time and human intervention needed to run statistically rigorous checks on ABM outputs, and introduce its integration with NetLogo. Using two ABMs from the NetLogo library, we showcase MultiVeStA's analysis capabilities for NetLogo ABMs, as well as a novel application to statistically rigorous calibration. Our toolchain makes it straightforward to perform statistical checks with NetLogo models, promoting more rigorous and reliable analyses of ABM outputs.
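
Question (i) above, how many independent replications to run, can be illustrated with a generic sequential stopping rule: keep adding replications until the confidence interval around the mean output is narrow enough. The sketch below is only an illustration of that general idea under a normal approximation. It is not MultiVeStA's API or algorithm, and the names `run_until_precise`, `simulate`, `delta`, `batch` and `max_reps` are hypothetical.

```python
import math
import random
import statistics


def run_until_precise(simulate, delta, z=1.96, batch=10, max_reps=10_000):
    """Run independent replications until the CI half-width is <= delta.

    simulate : zero-argument callable returning one scalar per replication
               (a stand-in for one run of a simulation model)
    delta    : target half-width of the confidence interval
    z        : normal quantile (1.96 gives roughly 95% confidence)
    batch    : replications added per round (must be >= 2)
    """
    samples = []
    half_width = math.inf
    while len(samples) < max_reps:
        # Each call to simulate() is one independent replication.
        samples.extend(simulate() for _ in range(batch))
        n = len(samples)
        # Normal-approximation half-width of the CI for the mean.
        half_width = z * statistics.stdev(samples) / math.sqrt(n)
        if half_width <= delta:
            break
    return statistics.fmean(samples), half_width, len(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "model": noisy output with true mean 5 and standard deviation 2.
    mean, hw, n = run_until_precise(lambda: rng.gauss(5.0, 2.0), delta=0.1)
    print(f"mean={mean:.3f} +/- {hw:.3f} after {n} replications")
```

The number of replications is thus an output of the procedure rather than a value fixed in advance by rule of thumb. The warm-up and run-length questions (ii) and (iii) need separate treatment that this sketch does not cover.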

artificial intelligence, independent replication, replication, (14 more...)

arXiv:2509.10977

Country:

Europe > Denmark (0.28)
Europe > Italy (0.28)

Genre: Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Concentration Bounds for Optimized Certainty Equivalent Risk Estimation

Ghosh, Ayon, Prashanth, L. A., Jagannathan, Krishna

arXiv.org Machine Learning, May-31-2024

We consider the problem of estimating the Optimized Certainty Equivalent (OCE) risk from independent and identically distributed (i.i.d.) samples. For the classic sample average approximation (SAA) of OCE, we derive mean-squared error bounds as well as concentration bounds (assuming sub-Gaussianity). Further, we analyze an efficient stochastic approximation-based OCE estimator and derive finite-sample bounds for it. To show the applicability of our bounds, we consider a risk-aware bandit problem with OCE as the risk measure. For this problem, we derive a bound on the probability of misidentification. Finally, we conduct numerical experiments to validate the theoretical findings.
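
As a rough illustration of the SAA idea, the sketch below assumes the common loss-based form of the OCE, inf over xi of xi + E[phi(X - xi)] for a convex function phi, and replaces the expectation with a sample mean. This form and the names `oce_saa`, `cvar_phi` and `candidates` are assumptions made for illustration; the paper's exact definition and estimator may differ. With phi(t) = max(t, 0) / (1 - alpha), this form reduces to CVaR at level alpha.

```python
def oce_saa(samples, phi, candidates=None):
    """Sample average approximation of inf_xi { xi + mean(phi(x - xi)) }.

    The infimum is searched over `candidates` (default: the sample points).
    For the piecewise-linear CVaR choice of phi the minimum is attained at
    a sample point, so the default search is exact; for a general phi it is
    only a grid approximation.
    """
    if candidates is None:
        candidates = samples
    n = len(samples)

    def objective(xi):
        return xi + sum(phi(x - xi) for x in samples) / n

    return min(objective(xi) for xi in candidates)


def cvar_phi(alpha):
    """phi for which the OCE above equals CVaR at level alpha (0 < alpha < 1)."""
    return lambda t: max(t, 0.0) / (1.0 - alpha)


if __name__ == "__main__":
    losses = [float(v) for v in range(1, 11)]  # toy i.i.d. losses 1..10
    # Empirical CVaR at level 0.8: the average of the worst 20% of losses.
    print(oce_saa(losses, cvar_phi(0.8)))
```

Choosing phi(t) = t makes the objective equal to the sample mean for every xi, which recovers the risk-neutral case. The default search evaluates the objective at every sample point, so it costs O(n^2) and is meant for small samples only.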

estimation, risk measure, theorem 3, (15 more...)

arXiv:2405.20933

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.48)

A network-constrain Weibull AFT model for biomarkers discovery

Angelini, Claudia, De Canditiis, Daniela, De Feis, Italia, Iuliano, Antonella

arXiv.org Machine Learning, Feb-28-2024

We propose AFTNet, a novel network-constrained survival analysis method based on the Weibull accelerated failure time (AFT) model, fitted by a penalized likelihood approach for variable selection and estimation. When using the log-linear representation, the inference problem becomes a structured sparse regression problem, for which we explicitly incorporate the correlation patterns among predictors using a double penalty that promotes both sparsity and a grouping effect. Moreover, we establish the theoretical consistency of the AFTNet estimator and present an efficient iterative computational algorithm based on the proximal gradient descent method. Finally, we evaluate the performance of AFTNet on both synthetic and real data examples.
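
To make the proximal gradient idea concrete, here is a minimal sketch. It assumes the double penalty has the form lam1 * ||beta||_1 + lam2 * beta^T L beta, where L is a graph Laplacian over the predictors; that form is an assumption, since the abstract does not spell it out. A simple quadratic loss stands in for the Weibull AFT negative log-likelihood, so this is not the AFTNet algorithm itself, and all function names are hypothetical.

```python
import math


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied coordinate-wise."""
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]


def proximal_gradient(grad_smooth, beta0, step, lam1, iters=500):
    """Minimise g(beta) + lam1 * ||beta||_1 given the gradient of smooth g."""
    beta = list(beta0)
    for _ in range(iters):
        g = grad_smooth(beta)
        # Gradient step on the smooth part, then the l1 proximal step.
        beta = soft_threshold(
            [b - step * gi for b, gi in zip(beta, g)], step * lam1
        )
    return beta


def make_grad(target, laplacian, lam2):
    """Gradient of 0.5*||beta - target||^2 + lam2 * beta^T L beta.

    The quadratic loss is a stand-in for a negative log-likelihood;
    the Laplacian term pulls connected coefficients towards each other.
    """
    def grad(beta):
        l_beta = [sum(lij * bj for lij, bj in zip(row, beta))
                  for row in laplacian]
        return [(b - t) + 2.0 * lam2 * lb
                for b, t, lb in zip(beta, target, l_beta)]
    return grad


if __name__ == "__main__":
    # Two predictors joined by one edge in the (assumed) network.
    L = [[1.0, -1.0], [-1.0, 1.0]]
    grad = make_grad(target=[3.0, 1.0], laplacian=L, lam2=1.0)
    print(proximal_gradient(grad, [0.0, 0.0], step=0.2, lam1=0.0))
```

The soft-thresholding step is what produces exact zeros (sparsity), while the Laplacian term in the smooth part produces the grouping effect. The step size should not exceed the reciprocal of the Lipschitz constant of the smooth gradient, which is 1/5 in the toy example above.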

aft model, penalty, weibull aft model, (14 more...)

arXiv:2402.18242

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
